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Using  CAP  resources  we  have  been  able  to  uncover  lattice  geometry  effects  in 
the  entropic  lattice  Boltzmann  algorithm  that  had  not  been  expected  from  lower  grid 
resolution  runs.  In  the  entropic  formulation,  one  is  working  with  a  generalized  BGK 
collision  operator  that  has  within  it  the  germs  of  detailed  balance.  Thus,  the 
unconditionally  stable  algorithm  is  achieved  with  a  variable  transport  coefficient,  not 
unlike  Large  Eddy  Simulations  (LES)  in  CFD.  Indeed,  we  have  explored  this  connection 
in  some  detail  but  will  report  those  findings  elsewhere  due  to  space  limitations  here. 
Another  unexpected  result  unearthed  by  the  CAP  runs  was  the  dependence  of  the  ELB  on 
the  Mach  number.  A  low  Mach  number  expansion  has  to  be  performed  to  analytically 
evaluate  the  Lagrange  multipliers  arising  in  the  extremization  of  the  M-function  subject  to 
local  collisional  constraints.  We  have  found  that  the  Qi  5-bit  model  is  less  sensitive  to  the 
flow  Mach  number  than  the  Q27-bit  model.  Another  somewhat  unexpected  finding  was 
the  importance  of  maintaining  the  distribution  function  correlations  in  the  mesoscopic 
description.  To  perform  the  long-time  I  600-  grid  runs  we  needed  to  perform 
continuation  runs.  In  the  early  stages  of  CAP  we  tried  to  minimize  the  amount  of  i/o  read- 
out/read-in  and  to  reconstruct  the  relaxation  distribution  function  from  its  moments  rather 
than  keeping  the  full  correlation  information.  While  this  did  not  affect  the  energy  decay, 
there  were  significant  discontinuities  introduced  into  the  enstrophy  and  higher  energy 
spectral  moments. 

The  parallelization  strength  of  ELB  arises  from  the  modeling  of  the 
macroscopic  nonlinear  derivatives  by  local  moments.  Chapman-Enskog  asymptotics  will 
then,  on  projecting  back  into  physical  space,  yield  these  nonlinear  derivatives.  Indeed, 
this  will  allow  ideally  parallelized  Smagorinsky  type  LES  to  be  modeled  by  LB  methods 
and  in  LB-MHD  algorithms  enforce  automatically  V  B  =  0  without  the  recourse  to 
expensive  divergence  cleaning  algorithms. 

The  interconnection  between  quantum  algorithms  that  can  run  on  quantum  (and 
classical)  computers  and  ELB  (that  can  only  run  on  classical  computers)  has  been 
outlined  as  well  as  a  new  morphology  of  free  shear  turbulence  and  the  onset  of  laminar- 
to-turbulence  transition.  The  analogy  between 

Order-disorder  phase  transition  (Lattice)  I  sing  Model 

laminar-turbulence  fluid  transition  Entropic  Lattice  Boltzmaim  Model  will  be 
being  strongly  pursued  in  future  proposals. 
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During  the  3  years  of  this  grant,  we  have  continued  our  collaboration  with  Jeffrey  Yepez 
(AFRL,  Hancom  Field)  and  Linda  Vahala  (ODU)  on  both  quantum  and  entropic  lattice 


algorithms  for  the  solut  ion  of  nonlinear  physics  problems.  Because  of  the  extreme 
scalability  of  the  algorithms  that  we  have  been  developing,  we  were  chosen  for  CAP- 
Phase  I  for  the  new  IBM-P5+  supercomputer  (Babbage)  at  NAVO  MSRC.  Using  the  full 
2912  processors  available,  we  achieved  6.3  TF lops/s  sustained  performance  -  an 
excellent  performance  seeing  that  the  maximum  sustained  Flop-rate  is  just  over  20 
TFlops/s.  The  scaling  achieved  by  our  entropic  codes  is  near  perfect  for  these  algorithms 
-  as  seen  by  the  Figure  below. 
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As  a  result,  we  were  chosen  to  participate  in  CAP-Phase  il  and  presented  these  results  at 
the  DoD-UGC  2007  meeting  in  Pittsburgh. 

What  is  very  interesting  is  the  analogy  between  the  detailed  balance  quantum 
lattice  algorithms  and  entropic  lattice  Boltzmann  algorithms. 

At  each  space-time  grid  point  (x,/)  in  lattice  algorithms,  the  excited  state  of  a  qubit  |<j^ 
encodes  the  probability  f  of  the  existence  of  a  mesoparticle  moving  with  discrete  lattice 
velocity  =  Ax^  /  At ,  Ax^are  the  lattice  vector  links,  with  q  =  1,2, ....  Q,  where  Q  is 

the  number  of  qubits  at  each  spatial  node.  The  particle  momentum  is  determined  from  a 
suitably  chosen  qubit-qubit  interaction  Hamiltonian  H‘  while  the  spatial  location  arises 
from  the  free-streaming  Hamiltonian  —iti^ cq  V.  All  the  particle-particle  interactions 

? 

generated  by  H'  (from  2-body  up  to  Q-body  interactions)  can  be  mapped  onto  a  local 
collision  operator  at  x.  In  particular,  for  type-11  quantum  algorithms,  the 


quantum  entanglement  is  localized  to  those  Q-qubits  at  (x,r)  and  then  this  entanglement 
is  spread  throughout  the  lattice  by  unitary  streaming3,4: 

4  (x'r)  =  4M  +  nflU-4)  *  /,(1+Av+A,)=//(x’')  ■  (l) 

Here  /^is  the  incoming  probability  and  f  '  the  outgoing  probability.  In  the  classical 
limit,  there  exists  a  fundamental  discrete  entropy  function 1,2,5 

«(/;.../„) /»-,).  (2) 


where  the  normalized  weights  =1 


are  determined  self-consistently.  The 


collision  operator  in  Eq.  (1)  is  determined  so  that  one  remains  on  a  constant  entropy 
surface 


(3) 


Eqs.  (I)-(3)  constitute  the  basics  of  the  detailed-balance  lattice  algorithms  for  fluid 
turbulence  that  are  ideal  for  parallel  (both  classical  and  quantum)  supercomputers. 

In  the  Q-dimensional  velocity  space,  the  relaxation  distribution  function  f*q  is 
determined  analytically  by  extremizing  the  H-function  subject  to  the  local  coltisional 
constraints  of  conservation  of  probability  and  probability  flux.  f*q,  considered  as  a 
vector,  is  the  bisector  of  the  difference  between  the  incoming  and  outgoing  kinetic 


vectors  in  the  inviscid  limit  lim^^s>0  a  /  2r  =  2 : 


'It 

f  =  _  —  o  f'  -  fet>  + 


i-3l 

a 


a 


(4) 


Eliminating  fl^and  /' from  Eqs.  (4)  and  (1)  one  obtains  the  lattice  Boltzmann  (LB) 
equation 

/?(x  +  Ax?,/  +  A/)=/9(x,r)  +  ^[/;’(x,r)-4(x,r)]  ,  q  =  \....Q  (5) 

This  is  basically  the  entropic  LB1,2  with  the  BGK  collisional  relaxation  parameters 
«(x,/)/2rand  ftq  determined  from  Eqs.  (2)  and  (3).  In  the  Chapman-Enksog  limit, 

(Ax  — >  0  ,  A/  — >  0)—  and  identifying  the  density  and  momentum  moments  2  /,=/>• 
Z,cqfq  =  P  u  "  one  recovers  the  quasi-incompressible  Navier-Stokes  equation  with 


effective  viscosity:  =  — 


4r 


a(x,r) 


-  I 


molecular  viscosity:  =  —  ( 2  r  —  i )  ,  r>0.5 


(6) 


To  avoid  discrete  lattice  geometry  effects  polluting  the  turbulence  simulations,  one  is 
restricted  to  certain  Q's  on  a  cubic  lattice.  In  particular  it  can  be  shown  that  on  a  unit 
cubic  lattice,  the  lowest  order  kinetic  velocity  models  are 

Q15:  rest  velocity,  speed  I  (6  velocities),  speed  4i  (8  velocities)  -  i,e.,  Q  =  15 

Q19:  rest  velocity,  speed  I  (6  velocities),  speed  %/I  (12  velocities)  -  i.e.,  Q=  19 

Q27:  rest  velocity,  speed  1  (6),  speed  42  (12),  and  speed  4l  (8)  -  i.e.,  Q=  27  (7) 

Because  detailed  balance  is  in-built  into  the  entropic  LB  algorithm  [see  Eq.  (3)],  the 
scheme  is  unconditionally  stable  for  arbitrary  large  Reynolds  numbers.  Re  =  U0L  /  2 . 

In  Table  below  we  show  the  wallclock  time  and  average  performance  of  the  various  ELB 
models  for  the  full  2912  PEs  available  for  2000  LB  time  steps.  The  Q27  model,  based  on 
the  27  kinetic  streaming  vectors,  is  the  most  memory  intensive  (about  1  KB/grid  point) 
and  requires  a  wallclock  time  which  is  over  1.5  times  that  required  by  the  Q15  model 
(which  requires  just  0.5  KB/grid  point). 


#PEs 

GRID 

MODEL 

WALLCLOCK  (s) 

G  Flop  s/s  per  PE 

2912 

cal950J 

ELB-Q27 

7  554.7 

2.17 

2912 

cal9503 

ELB-Q19 

5  602.7 

2.24 

2912 

cal950J 

ELB-Q15 

4  798.4 

2.05 

Table  1  :  GFlops/s  per  CPU for  2912  CPUs  for  2000  time  steps  for  the  3  ELB-codes. 


For  CAP- Phase  II  we  wished  to  investigate  the  role  of  the  underlying  kinetic  lattice 
symmetry  on  Navier-Stokes  turbulence,  since  all  three  ELB-algorithms  recover  the 
Navier-Stokes  equations  to  leading  order  in  the  Chapman-Enskog  expansion.  This  is 
particularly  important  since  on  small  grids  (e.g.,  5123)  and  low  molecular  viscosities 

(H0  =  2  x  10^* )  we4  had  found  very  minor  differences  in  the  simulation  results  from  the 

Q27,  Q19  and  Q15  models.  With  2048  PEs  available  for  24  hour  shifts,  the  maximal 
spatial  grid  for  the  Q27  algorithm  was  16003.  All  3  models  were  run  with  the  same  base 

parameters  :  uQ  =  0.035  , fl0  =  5  x  10^  on  the  16003-grid  (i.e.,  with  a  base 


Re  =  uqL  /  2jrp.0  =  18,000  and  computational  resolution/grid  spacing  Re3  4/  L  =*  1 )  for  a 
Kida  initial  velocity  profile6  with  delta- function  energy  spectra. 

In  Fig.  2  we  plot  the  normalized  kinetic  energy  <|u(x,/|  >/<|u(x,o)  >,  the 
normalized  enstrophy  <|(o(x,/)|  >/<|co(x,o)  >,  the  palinstrophy  2  P(t)  = 

<  |  V  x  co|  > ,  where  the  vorticity  (0  =  V  x  u  ,  and  <  ..  >  represent  volume  average  over 

the  periodic  domain.  The  ELB-dissipation  rate  is  defined  by  =  2jU(^r(/]  (s..S^j, 

where  S,  is  the  usual  rate  of  strain  tensor  and  the  effective  relaxation  rate  (to  make  an 
analogy  with  standard  LB  algorithms) 


Fig.  2  The  evolution  of  the  normalized  (a)  kinetic  energy  K(t)/K(0),  (b)  enstrophy  Q(f)/ti(o)  .(c) 


palinstrophy  P(t)  (d)  dissipation  rate  ^(f).  (e)  T^for  the  Q27,  QI9  and  Q1S  algorithms  on  !60(f  -grid 
with  (J)  normalized  enstrophy  from  a  5  !  23  -simulation  at  somewhat  lower  molecular  viscosity. 

IAej,  (t)  =  (<  4t  /  a(x,f)  >  -  lJ/6.  Clearly  the  Q19-results  significantly  deviates  even 

qualitatively  from  the  Q27-  and  Q  15-results,  while  there  is  strong  quantitative  agreement 
between  the  Q27-  and  Q15-  models  (up  to  a  simple  rescaling).  This  contrasts  strongly 
with  a  low-resolution  grid  run  on  5 12’  at  a  somewhat  lower  molecular  viscosity,  see  Fig. 
2(f)-  It  appears  that  these  differences  arise  from  the  Newton-Raphson  root  finder  that 
determines  at  each  grid  point  and  at  each  time  the  function  that  enforces  detailed 

balance  on  the  constant  en tropic  surface,  Eq.  (3).  These  functions  cr(x,/j  seem  to  be 

much  more  lattice-dependent,  i.e.,  whether  Q27,  Q19  or  Q15,  than  would  have  been 
gleaned  from  small  grid  runs. 

In  Fig.  3  we  plot  the  development  of  the  longitudinal  and  transverse  ID  energy  spectra 


X  Mk>')  ’  Elrans(kxj)  X  f 'V ( ^ ^ ) 


2 


(8) 


for  the  initial  K.ida  velocity  profile6  with  initial  delta  function  spectra 

■ and  -*,[%- *)+%-«)]  <9> 
While  the  terabytes  of  data  from  the  early  stages  (t  <  28  K)  of  the  Q27-run  are  being 
retrieved  and  analyzed,  some  of  the  data  from  the  t  >  28  K  has  been  analyzed.  The  energy 

spectra  approximately  obey  a  /Ts'3  Kolmogorov  law,  with  a  slight  upturn  at  the  very  large 
kx  in  E.  ,  indicating  that  the  run  is  slightly  unresolved  at  these  scales 

E  (*, .  tl  E  (k  ,t> 

I®"*  *  c 


Fig.  3(a)  The  longitudinal  energy  spectrum,  (b)  the  transverse  energy  spectrum  for  t  >  28  K. 

The  probability  distribution  functions  (pdfs)  for  the  velocity  and  vorticity  components  are  shown  in 
Fig.  4.  The  velocity  field  is  basically  Gaussian  -  but  with  tails  that  are  substantially  higher  than  a 

Guassian.  These  tails  die  out  with  time  as  seen  by  the  plot  of  /’[vj  at  t  =  29K  (Fig.  4a)  and  at  t  = 

41 K.  (Fig.  4b).  The  pdfs  for  the  other  velocity  components  have  very  similar  behavior.  On  the 
other  hand,  the  vorticity  pdf  is  well  fitted  by  an  exponential  pdf.  This  is  indicative  of  intermittency 
in  the  turbulence: 


pdf  for  the  vorticity  component  0)fat  t  =  40K  is  shown  in  (c).  fitted  to  an  exponential  pdf. 


Turbulence  Morphology  for  Free  Shear  Turbulence 

We  have  started  to  examine  a  somewhat  new  turbulence  morphology  of  free  shear  turbulence 
and  the  correlation  between  the  onset  of  turbulence  in  a  laminar- to-turbulence  transition  and  the 
order-disorder  phase  transition  in  ferromagnetism*  Just  as  Ising  lattice  models  are  fundamental  to 
understanding  critical  phenomena,  kinetic  lattice  gas  models  that  we  are  pursuing  could  have  a 
similar  impact*  We  now  give  some  preliminary  results  on  the  turbulence  morphology  from  5 123 
grid  runs*  The  morphology  can  be  broken  down  into  3  main  stages.  Fig.  5.  Stage  I  occurs  in  the 

initial  time  interval  0  <  t  <  3.2K  with  the  enstrophy  increases  exponentially.  Independent  of 

the  viscosity.  The  enstrophy  curve  is  plotted  in  Fig,  5  with  the  integer  dots  ‘  1  \  fc2’  ...  *7T  -  and 
these  integers  correspond  to  the  isosurfaces  of  constant  vorticity  at  t  =  IK,  t  =  2K  ***  t  =  7K  in 
Fig.  5*  The  color  coding  is  based  on  the  value  of  u  (0  ;  grey  corresponds  to  U  *  <i)  =  -,  blue  for 
u  ■  CO  =  +  1  and  red  for  u  ay  -  -1 .  In  this  initial  stage,  the  isosurfaces  of  vorticity  are  stretched 
with  a  sharp  rise  in  dQ  /  dt  (the  sharply  rising  curve  above  the  enstrophy  curve  in  Stage  1).  In 
Stage  2,  for  time  3.2K  <  t  <  9K,  shown  shaded  in  Fig.  5,  there  is  large  scale  anisotropic 
turbulence  with  intermittcncy*  In  this  shaded  region  d£l  /  dt  becomes  jagged  and  predominantly 
is  decreasing  in  large  spurts  with  intermediate  avalanches  occurring  at  t  =  5. IK,  and  6.75K 
(vertical  red  lines  in  Stage  2  of  Fig*  5)*  Stage  3,  for  9K  <  t  <  14K,  is  the  inertial  subrange  with 

eventual  exponential  decay  of  the  enstrophy  (sec  curve  fitted  red  line  that  fits  well  for  t  > 

10K).  In  this  Stage  3,  we  see  the  onset  of  homogeneous  isotropic  small  scale  turbulence  with 

energy  cascading  to  small  scales  leading  to  the  Kolmogorov  k~5 /3  energy  spectrum.  The 
velocity  pdf  is  Gaussian  while  the  vorticity  pdf  is  exponential  (see  the  inset  plots  in  Fig,  5). 


ft* 


Fig.  S  Surfaces  of  cons  ram  vorticttyfrom  r  =  OK  to  t  =  7K,  color 
coded  by  u  ■  to  [grey  =  0.  blue  =  -  J ,  red  =  -1]  Stage  1:  0  <  t  < 
3.2K.  Vortex  stretching,  with  an  exponential  growth  in  the 
ens trophy  ft(r)-  the  1  ....7  corresponds  to  vorticity  isosurface 

plots.  Stage  2:  3.2K  <t<  9K  Large  scale  anisotropic 
turbulence  and  intermitienn'  occur.  The  major  breaking  points 
are  at  3. 2K.  5.  IK  and  6.SK  (red  lines  in  ensrrophy  plot/.  Stage 
3:  t  >  9K  inertia!  subrange  with  homogeneous  isotropic  small 
scale  Turbulence  with  k'!  *  Kolmogorov  energy' 
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Concluding  Remarks 

Using  CAP  resources  we  have  been  able  to  uncover  lattice  geometry  effects  in 
the  entropic  lattice  Boltzmann  algorithm  that  had  not  been  expected  from  lower  grid 
resolution  runs.  In  the  entropic  formulation,  one  is  working  with  a  generalized  BGK 
collision  operator  that  has  within  it  the  germs  of  detailed  balance.  Thus,  the 
unconditionally  stable  algorithm  is  achieved  with  a  variable  transport  coefficient,  not 
unlike  Large  Eddy  Simulations  (LES)  in  CFD.  Indeed,  we  have  explored  this  connection 
in  some  detail  but  will  report  those  findings  elsewhere  due  to  space  limitations  here. 
Another  unexpected  result  unearthed  by  the  CAP  runs  was  the  dependence  of  the  ELB  on 
the  Mach  number.  A  low  Mach  number  expansion  has  to  be  performed  to  analytically 
evaluate  the  Lagrange  multipliers  arising  in  the  extremization  of  the  H-function  subject  to 
local  collisional  constraints.  We  have  found  that  the  Q  15-bit  model  is  less  sensitive  to 
the  flow  Mach  number  than  the  Q27-bit  model.  Another  somewhat  unexpected  finding 
was  the  importance  of  maintaining  the  distribution  function  correlations  in  the 
mesoscopic  description.  To  perform  the  long-time  16003  grid  runs  we  needed  to  perform 
continuation  runs.  In  the  early  stages  of  CAP  we  tried  to  minimize  the  amount  of  i/o 
read-out/read-in  and  to  reconstruct  the  relaxation  distribution  function  from  its  moments 
rather  than  keeping  the  full  correlation  information.  While  this  did  not  affect  the  energy 
decay,  there  were  significant  discontinuities  introduced  into  the  enstrophy  and  higher 
energy  spectral  moments. 

The  parallelization  strength  of  ELB  arises  from  the  modeling  of  the 
macroscopic  nonlinear  derivatives  by  local  moments.  Chapman-Enskog  asymptotics  will 
then,  on  projecting  back  into  physical  space,  yield  these  nonlinear  derivatives.  Indeed, 
this  will  allow  ideally  parallelized  Smagorinsky  type  LES  to  be  modeled  by  LB  methods 
and  in  LB-MHD  algorithms  enforce  automatically  V  B  =  0without  the  recourse  to 
expensive  divergence  cleaning  algorithms. 

The  interconnection  between  quantum  algorithms  that  can  run  on  quantum  (and 
classical)  computers  and  ELB  (that  can  only  run  on  classical  computers)  has  been 
outlined  as  well  as  a  new  morphology  of  free  shear  turbulence  and  the  onset  of  laminar- 
to-turbulence  transition.  The  analogy  between 

Order-disorder  phase  transition  «-»  (Lattice)  Ising  Model 

laminar-turbulence  fluid  transition  «  Entropic  Lattice  Boltzmann  Model 
will  be  being  strongly  pursued  in  future  proposals. 
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